Search Result

Infrared dim small target tracking method based on Siamese network and Transformer

Chenhui CUI, Suzhen LIN, Dawei LI, Xiaofei LU, Jie WU

Journal of Computer Applications 2024, 44 (2): 563-571. DOI: 10.11772/j.issn.1001-9081.2023020167

A method based on a Siamese network and Transformer was proposed to address the low accuracy of infrared dim small target tracking. First, a multi-feature extraction cascading module was constructed to separately extract the deep features of the infrared dim small target template frame and the search frame, and to concatenate them with their corresponding HOG features at the dimension level. Second, a Transformer with a multi-head attention mechanism was introduced to perform cross-correlation between the template feature map and the search feature map, generating a response map. Finally, the target's center position in the image and the regression bounding box were obtained through the response map upsampling network and the bounding box prediction network, completing the tracking of infrared dim small targets. Test results on a dataset of 13,655 infrared images show that, compared with the KeepTrack tracking method, the success rate is improved by 5.9 percentage points and the precision by 1.8 percentage points; compared with the TransT (Transformer Tracking) method, the success rate is improved by 14.2 percentage points and the precision by 14.6 percentage points. The proposed method thus tracks infrared dim small targets in complex backgrounds more accurately.
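As a rough illustration of how a response map locates a target, the sketch below uses plain sliding-window cross-correlation, as in classic Siamese trackers. This is a simplification and not the paper's method, which performs the correlation with a multi-head attention Transformer. The function names and the toy feature maps are assumptions made for this example.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` (both 2-D lists of numbers)
    and return the response map of dot-product scores."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            score = sum(
                search[i + di][j + dj] * template[di][dj]
                for di in range(th)
                for dj in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


def peak_location(response):
    """Return (row, col) of the highest score in the response map."""
    best = max(
        ((value, i, j) for i, row in enumerate(response) for j, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return best[1], best[2]


# Toy example: a bright 2x2 blob in a 4x4 search map.
search_map = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
template_map = [[1, 1], [1, 1]]
response_map = cross_correlate(search_map, template_map)
```

The peak of `response_map` gives the top-left offset of the best match; a real tracker would then upsample the map and regress a bounding box, as the abstract describes.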

Deep bi-modal source domain symmetrical transfer learning for cross-modal retrieval

Qiujie LIU, Yuan WAN, Jie WU

Journal of Computer Applications 2024, 44 (1): 24-31. DOI: 10.11772/j.issn.1001-9081.2023010047

Cross-modal retrieval based on deep networks often faces the challenge of insufficient cross-modal training data, which limits the training effect and easily leads to over-fitting. Transfer learning is an effective way to solve the problem of insufficient training data by learning from the training data in the source domain and transferring the acquired knowledge to the target domain. However, most existing transfer learning methods focus on transferring knowledge from a single-modal (such as image) source domain to a cross-modal (such as image and text) target domain. If the source domain contains information from multiple modalities, this asymmetric transfer ignores the potential inter-modal semantic information contained in the source domain. At the same time, the similarity between the same modalities in the source domain and the target domain cannot be well extracted to reduce the domain difference. Therefore, a Deep Bi-modal source domain Symmetrical Transfer Learning for cross-modal retrieval (DBSTL) method was proposed. The purpose of this method is to realize knowledge transfer from a bi-modal source domain to a multi-modal target domain and to obtain a common representation of cross-modal data. DBSTL consists of a modal symmetric transfer subnet and a semantic consistency learning subnet. With the hybrid symmetric structure adopted in the modal symmetric transfer subnet, the information between modalities became more consistent and the difference between the source domain and the target domain was reduced. In the semantic consistency learning subnet, all modalities shared the same common representation layer, and cross-modal semantic consistency was ensured under the guidance of the supervision information of the target domain. Experimental results show that on the Pascal, NUS-WIDE-10k and Wikipedia datasets, the mean Average Precision (mAP) of the proposed method is improved by about 8.4, 0.4 and 1.2 percentage points, respectively, compared with the best result obtained by the comparison methods. DBSTL makes full use of the potential information of the bi-modal source domain to conduct symmetric transfer learning, ensures the semantic consistency between modalities under the guidance of the supervision information, and improves the similarity of the image and text distributions in the common representation space.
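For readers unfamiliar with the mAP metric reported above, the following is a minimal sketch of one common way to compute it for retrieval: average precision is taken over each query's ranked result list, then averaged across queries. The function names and the binary relevance encoding are assumptions for illustration; the paper's exact evaluation protocol is not stated here.

```python
def average_precision(relevance):
    """Average precision for one ranked list.

    `relevance` holds 1 (relevant) or 0 (not relevant) for each
    retrieved item, in rank order.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


# Two hypothetical queries, e.g. image queries retrieving text items.
toy_rankings = [[1, 0, 1], [0, 1]]
toy_map = mean_average_precision(toy_rankings)
```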

Ultra-short-term photovoltaic power prediction by deep reinforcement learning based on attention mechanism

Zhengkai DING, Qiming FU, Jianping CHEN, You LU, Hongjie WU, Nengwei FANG, Bin XING

Journal of Computer Applications 2023, 43 (5): 1647-1654. DOI: 10.11772/j.issn.1001-9081.2022040542

To address the problem that traditional PhotoVoltaic (PV) power prediction models are affected by random power fluctuations and tend to ignore important information, resulting in low prediction accuracy, the ADDPG and ARDPG models were proposed by combining the attention mechanism with Deep Deterministic Policy Gradient (DDPG) and Recurrent Deterministic Policy Gradient (RDPG), respectively, and a PV power prediction framework was built on this basis. Firstly, the original PV power data and meteorological data were normalized, and the PV power prediction problem was modeled as a Markov Decision Process (MDP), where the historical power data and current meteorological data were used as the states of the MDP. Then, the attention mechanism was added to the Actor networks of DDPG and RDPG, assigning different weights to different components of the state to highlight critical information, which was learned through the interaction between Deep Reinforcement Learning (DRL) agents and historical data. Finally, the MDP problem was solved to obtain the optimal strategy and make accurate predictions. Experimental results on DKASC and Alice Springs PV system data show that ADDPG and ARDPG achieve the best results in Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and R². The proposed models can therefore effectively improve the prediction accuracy of PV power, and can also be extended to other prediction fields such as grid prediction and wind power generation prediction.
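The three evaluation metrics named above have standard definitions, sketched below in plain Python. The function names and sample values are assumptions for illustration and are not taken from the paper's data.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical normalized power values and predictions.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
```

Lower RMSE and MAE and an R² closer to 1 indicate a better fit.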

Semi-supervised knee abnormality classification based on multi-imaging center MRI data

Jie WU, Shitian ZHANG, Haibin XIE, Guang YANG

Journal of Computer Applications 2022, 42 (1): 316-324. DOI: 10.11772/j.issn.1001-9081.2021010200

Manually labeling large amounts of data is laborious, and the amount of Magnetic Resonance Imaging (MRI) data available from a single imaging center is limited. To address these problems, a Magnetic Resonance Semi-Supervised Learning (MRSSL) method utilizing labeled and unlabeled MRI data from multiple imaging centers was proposed and applied to knee abnormality classification. Firstly, data augmentation was used to provide the inductive bias required by the model. Next, the classification loss and the consistency loss were combined to constrain an artificial neural network to extract discriminative features from the data. Then, the features were used for MRI knee abnormality classification. Additionally, the corresponding Magnetic Resonance Supervised Learning (MRSL) method, which uses only labeled samples, was proposed and compared with MRSSL on the same labeled samples. The results demonstrate that MRSSL surpasses MRSL in both classification performance and generalization ability. Finally, MRSSL was compared with other semi-supervised learning methods. The results indicate that data augmentation plays an important role in performance improvement and that, being more inclusive of MRI data, MRSSL outperforms the other methods on knee abnormality classification.
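One common way to combine a classification loss with a consistency loss can be sketched as follows. This is an assumption-laden illustration, not the paper's formulation: it uses cross-entropy on labeled samples, a mean squared difference between predictions for two augmented views of each unlabeled sample, and a hypothetical weighting factor `weight`.

```python
import math


def cross_entropy(probs, label):
    """Classification loss for one labeled sample (probs sum to 1)."""
    return -math.log(probs[label])


def consistency(probs_a, probs_b):
    """Mean squared difference between predictions for two
    augmented views of the same unlabeled sample."""
    return sum((a - b) ** 2 for a, b in zip(probs_a, probs_b)) / len(probs_a)


def total_loss(labeled, unlabeled_pairs, weight=1.0):
    """Classification loss plus weighted consistency loss.

    `labeled` is a list of (probs, label) pairs; `unlabeled_pairs` is a
    list of (probs_view_a, probs_view_b) pairs. `weight` is hypothetical.
    """
    cls = sum(cross_entropy(p, y) for p, y in labeled) / len(labeled)
    con = sum(consistency(a, b) for a, b in unlabeled_pairs) / len(unlabeled_pairs)
    return cls + weight * con


# Toy predictions for a two-class problem (normal vs. abnormal).
labeled_batch = [([0.5, 0.5], 0)]
unlabeled_batch = [([0.8, 0.2], [0.6, 0.4])]
loss = total_loss(labeled_batch, unlabeled_batch, weight=2.0)
```

The consistency term needs no labels, which is what lets unlabeled scans from additional imaging centers contribute to training.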

Turbo decoding algorithm based on linear approximation of correction function

LI Zheng, SONG Chun-lin, ZHAO Yun-jie, WU Zhu-jia

Journal of Computer Applications 2012, 32 (08): 2113-2115.

As new-generation communication systems, LTE and LTE-A require more reliable communication because of their high throughput. Among Turbo decoding algorithms, the Log-MAP algorithm, although a simplified algorithm, performs well, but its high complexity and delay remain a significant problem; the Max-Log-MAP algorithm has low complexity but cannot match the performance of the Log-MAP algorithm. This paper proposed an improved Turbo decoding algorithm based on a linear approximation of the correction function. The improved algorithm adopted different correction fitting parameters for different regions. The simulation results demonstrate that, compared with the existing algorithms, the improved algorithm achieves the same Bit Error Rate (BER) performance as the Log-MAP algorithm while effectively reducing the decoding delay. More importantly, the proposed algorithm is easier to implement.
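The correction function at issue comes from the Jacobian logarithm, max*(a, b) = max(a, b) + ln(1 + e^(-|a - b|)): Log-MAP keeps the correction term, while Max-Log-MAP drops it. The sketch below approximates that term with one straight line per region. The region boundaries in `BREAKPOINTS` and the chord-based fit are hypothetical choices for illustration; the paper's fitted parameters are not given in the abstract.

```python
import math

# Hypothetical region boundaries (not the paper's fitted values).
BREAKPOINTS = [0.0, 1.0, 2.0, 4.0]


def correction(x):
    """Exact Log-MAP correction term ln(1 + e^(-|x|))."""
    return math.log1p(math.exp(-abs(x)))


# Precompute one line per region so decoding needs only a
# comparison, a multiply and an add (no exp or log).
SEGMENTS = []
for lo, hi in zip(BREAKPOINTS, BREAKPOINTS[1:]):
    slope = (correction(hi) - correction(lo)) / (hi - lo)
    SEGMENTS.append((lo, hi, correction(lo), slope))


def linear_correction(x):
    """Piecewise-linear approximation of the correction term."""
    x = abs(x)
    for lo, hi, value_at_lo, slope in SEGMENTS:
        if x <= hi:
            return value_at_lo + slope * (x - lo)
    return 0.0  # beyond the last region the correction is negligible


def max_star(a, b, corr=correction):
    """Jacobian logarithm ln(e^a + e^b) using the given correction."""
    return max(a, b) + corr(a - b)
```

Calling `max_star(a, b, corr=linear_correction)` gives the approximated operator, and passing `corr=lambda x: 0.0` recovers the Max-Log-MAP behavior.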
